SCALING LIMITS OF MARKOV BRANCHING TREES AND
GALTON-WATSON TREES CONDITIONED ON THE NUMBER OF
VERTICES WITH OUT-DEGREE IN A GIVEN SET



DOUGLAS RIZZOLO 

Abstract. We generalize recent results of Haas and Miermont in [7] to obtain scaling limits
of Markov branching trees whose size is specified by the number of nodes whose out-degree
lies in a given set. We then show that this implies that the scaling limit of finite variance
Galton-Watson trees conditioned on the number of nodes whose out-degree lies in a given set
is the Brownian continuum random tree. The key to this is a generalization of the classical
Otter-Dwass formula.



1. Introduction 

Recently there has been considerable interest in the asymptotic properties of random trees.
Much of the focus has been on limits of trees that either satisfy nice consistency relations
between trees of various sizes ([H]) or have a nice encoding as continuous functions on
$[0, \infty)$ (see [H] for an overview). Particular interest has been focused on limits of
Galton-Watson trees conditioned on the total number of vertices. The standard techniques for
proving the convergence of Galton-Watson trees to continuum trees are intimately connected
with contour processes of these trees. However, conditioning on the number of vertices in a
subset produces significantly more complicated contour processes than conditioning on the
total number of vertices, so we must take a different approach. Techniques for handling
Markov branching trees whose size is their number of leaves were developed in [7], and we
generalize this approach to Markov branching trees whose size is their number of vertices
whose out-degree falls in a given set.

Our main result, the notation for which will be fully defined later, is the following theorem. 

Theorem 1. Let $T$ be a critical Galton-Watson tree with offspring distribution $\xi$ such that
$0 < \sigma^2 = \mathrm{Var}(\xi) < \infty$ and let $A \subseteq \{0, 1, 2, \dots\}$ contain 0. Suppose that for
sufficiently large $n$ the probability that $T$ has exactly $n$ vertices with out-degree in $A$ is
positive, and for such $n$ let $T_n^A$ be $T$ conditioned to have exactly $n$ vertices with out-degree
in $A$, considered as a rooted unordered tree with edge lengths 1 and the uniform probability
distribution $\mu_{\partial_A T_n^A}$ on its vertices with out-degree in $A$. Then

\[ \frac{1}{\sqrt{n}}\, T_n^A \xrightarrow{d} \frac{2}{\sigma\sqrt{\xi(A)}}\, T_{1/2,\nu_2}, \]

where the convergence is with respect to the rooted Gromov-Hausdorff-Prokhorov topology
and $T_{1/2,\nu_2}$ is the Brownian continuum random tree.

In the case $A = \mathbb{Z}^+ = \{0,1,2,\dots\}$ we recover the classical result about the scaling limit
of a Galton-Watson tree conditioned on its number of vertices first obtained in [1]. For
other choices of $A$ the result appears to be new. The condition that for sufficiently large
$n$ the probability that $T$ has exactly $n$ vertices with out-degree in $A$ is positive is purely
technical and could be dispensed with at the cost of chasing a constant (which may appear
in the limit) through our computations. In addition to generalizing the results of [7], the key
to proving this theorem is a generalization of the classical Otter-Dwass formula, which we
prove in Section 3.1. The Otter-Dwass formula (see [12]) has been an essential tool in several
proofs that the Brownian continuum random tree is the scaling limit of Galton-Watson trees
conditioned on their number of vertices, including the original proof in [1] as well as newer
proofs in [10] and [7]. While we follow the approach in [7], our generalization of the
Otter-Dwass formula should allow for proofs along the lines of [1] and [10] as well.
Furthermore, with our results here, it should be straightforward to prove the analogous
theorem in the infinite variance case using the approach in [7].

This paper is organized as follows. Section 2 introduces our basic notation, as well as
the Markov branching trees and continuum trees that will arise for us as scaling limits. It
concludes with our generalization of the scaling limits in [7]. Section 3 is devoted to our
study of Galton-Watson trees. We begin by proving our generalization of the Otter-Dwass
formula and we then use this to analyze the asymptotics of the partition at the root of a
Galton-Watson tree. Bringing this all together, we finish with the proof of Theorem 1.

Acknowledgment. I would like to thank my adviser, Jim Pitman, for suggesting the problem
that led to this paper and for helpful conversations and feedback along the way.

2. Models of trees 

2.1. Basic notation. Fix a countably infinite set $S$; we will consider the vertex sets of all
graphs discussed to be subsets of $S$. A rooted ordered tree is a finite connected acyclic graph
$t$ with a distinguished vertex called the root and such that, if $v$ is a vertex of $t$, the set of
vertices in $t$ that are both adjacent to $v$ and further from the root than $v$ with respect to the
graph metric is linearly ordered. Let $\mathcal{T}_S$ be the set of all rooted ordered trees whose vertex
set is a subset of $S$. For $t, s \in \mathcal{T}_S$, define $t \sim s$ if and only if there is a root and order
preserving isomorphism from $t$ to $s$, and let $\mathcal{T} = \mathcal{T}_S / \sim$ be the set of rooted ordered trees
considered up to root and order preserving isomorphism. By a similar construction, we let
$\mathcal{T}^u$ be the set of rooted unordered trees considered up to root preserving isomorphism. If $t$
is in $\mathcal{T}$ or $\mathcal{T}^u$ and $v \in t$ is a vertex, the out-degree of $v$ is the number of vertices in $t$ that
are both adjacent to $v$ and further from the root than $v$ with respect to the graph metric. The
out-degree of $v$ will simply be denoted by $\deg(v)$, since we will only ever discuss out-degrees.
Fix a set $A \subseteq \mathbb{Z}^+$ such that $0 \in A$ and, for $t$ in $\mathcal{T}$ or $\mathcal{T}^u$, define $\#_A t$ to be the number of
vertices in $t$ whose out-degree is in $A$. Furthermore, we define $\mathcal{T}_{A,n}$ and $\mathcal{T}^u_{A,n}$ by

\[ \mathcal{T}_{A,n} = \{t \in \mathcal{T} : \#_A t = n\} \quad\text{and}\quad \mathcal{T}^u_{A,n} = \{t \in \mathcal{T}^u : \#_A t = n\}. \]

2.2. Markov branching trees. In this section we extend the notion of Markov branching
trees developed in [7], where Markov branching trees were constructed separately in the cases
$A = \{0\}$ and $A = \mathbb{Z}^+$. Here we give a construction for general $A$ such that $0 \in A$. Let $\mathcal{P}_n$
be the set of partitions of $n$ and, for $\lambda \in \mathcal{P}_n$, let $p(\lambda)$ be the number of nonzero blocks in $\lambda$
and $m_j(\lambda)$ the number of blocks in $\lambda$ equal to $j$. For convenience, we take $\mathcal{P}_1 = \{\emptyset, (1)\}$ and
define $p(\emptyset) = -1$. Define $\mathcal{P}_1^A = \mathcal{P}_1$ and for $n \ge 2$, define $\mathcal{P}_n^A$ by

\[ \mathcal{P}_n^A = \{\lambda \in \mathcal{P}_n : p(\lambda) \notin A\} \cup \{\lambda \in \mathcal{P}_{n-1} : p(\lambda) \in A\}. \]

Let $(n_k)$ be an increasing sequence of integers. A sequence $(q_{n_k})_{k \ge 1}$, such that $q_{n_k}$ is
a probability measure on $\mathcal{P}^A_{n_k}$, is called compatible if for each $k$, $q_{n_k}$ is concentrated on
partitions $\lambda = (\lambda_1, \dots, \lambda_p)$ such that $q_{\lambda_i}$ is defined for all $i$. Suppose further that $q_1$ is
defined, $q_{n_k}((n_k)) < 1$ if $1 \notin A$ and $q_2((1)) = 1$ if $1 \in A$. Our goal is to construct a sequence
of laws $(P^q_{n_k})_{k \ge 1}$ such that $P^q_{n_k}$ is a law on $\mathcal{T}^u_{A,n_k}$ and such that the subtrees above a vertex
are conditionally independent given the degree of that vertex.

Define $P^q_1$ to be the law of the tree consisting of a root attached to a leaf by a path with $G$
edges, where $G = 0$ if $1 \in A$ and $G$ has the geometric distribution
$P(G = j) = q_1(\emptyset)(1 - q_1(\emptyset))^j$, $j \ge 0$, if $1 \notin A$. For $k \ge 2$, $P^q_{n_k}$ is defined as follows:
Choose $\lambda \in \mathcal{P}^A_{n_k} \setminus \{(n_k)\}$ according to $q_{n_k}(\,\cdot \mid \mathcal{P}^A_{n_k} \setminus \{(n_k)\})$ and independently
choose $G'$ with $G' = 1$ if $1 \in A$, while $G'$ has the geometric distribution

\[ P(G' = j) = (1 - q_{n_k}((n_k)))\, q_{n_k}((n_k))^{j-1}, \qquad j \ge 1, \]

if $1 \notin A$. Let $(T_1, T_2, \dots, T_{p(\lambda)})$ be a vector of trees, independent of $G'$, such that the $T_j$ are
independent and $T_j$ has distribution $P^q_{\lambda_j}$, and let $T$ be the tree that results from attaching
the roots of the $T_j$ to the same new vertex and then, if $G' = 1$, calling this vertex the root, and
otherwise attaching that vertex to a new root by a path with $G' - 1$ edges. $P^q_{n_k}$ is defined to
be the law of $T$. An easy induction shows that $P^q_{n_k}$ is concentrated on the set of unordered
rooted trees with exactly $n_k$ vertices whose out-degree is in $A$.

To connect with [7], note that if $(n_k) = (1, 2, 3, \dots)$, the case $A = \{0\}$ corresponds to the
$P^q_n$ defined in [7] and the case $A = \mathbb{Z}^+$ corresponds to the $Q^q_n$ defined in [7]. Other choices
of $A$ interpolate between these two extremes. A sequence $(T_{n_k})_{k \ge 1}$ such that for each $k$,
$T_{n_k}$ has law $P^q_{n_k}$ for some choice of $A$ and $q$ (independent of $n$) is called a Markov branching
family. For ease of notation, we will generally drop the subscript $k$ and it will be implicit
that we are only considering $n$ for which the quantities discussed are defined.

2.3. Trees as metric measure spaces. The trees we have been talking about can naturally
be considered as metric spaces with the graph metric. That is, the distance between two
vertices is the number of edges on the path connecting them. Let $(t, d)$ be a tree equipped
with the graph metric. For $a > 0$, we define $at$ to be the metric space $(t, ad)$, i.e. the metric
is scaled by $a$. This is equivalent to saying the edges have length $a$ rather than length 1 in
the definition of the graph metric. More generally, we can attach a positive length to each
edge in $t$ and use these in the definition of the graph metric. Moreover, the trees we are
dealing with are rooted, so we consider $(t, d)$ as a pointed metric space with the root as the
point. Finally, we are concerned with the vertices whose out-degree is in $A$, so we attach
a measure $\mu_{\partial_A t}$, which is the uniform probability measure on $\partial_A t = \{v \in t : \deg(v) \in A\}$.
If we have a random tree $T$, this gives rise to a random pointed metric measure space
$(T, d, \mathrm{root}, \mu_{\partial_A T})$. To make this last concept rigorous, we need to put a topology on pointed
metric measure spaces. This is hard to do in general, but note that the pointed metric
measure spaces that come from the trees we are discussing are compact.

Let $\mathcal{M}_w$ be the set of equivalence classes of compact pointed metric measure spaces
(equivalence here being up to point and measure preserving isometry). It is worth pointing
out that $\mathcal{M}_w$ actually is a set in the sense of ZFC, though this takes some work to
show. We metrize $\mathcal{M}_w$ with the pointed Gromov-Hausdorff-Prokhorov metric (see [7]). Fix
$(X, d, \rho, \mu), (X', d', \rho', \mu') \in \mathcal{M}_w$ and define

\[ d_{GHP}(X, X') = \inf_{(M,\delta)} \ \inf_{\phi : X \to M,\ \phi' : X' \to M} \left[ \delta(\phi(\rho), \phi'(\rho')) \vee \delta_H(\phi(X), \phi'(X')) \vee \delta_P(\phi_*\mu, \phi'_*\mu') \right], \]

where the first infimum is over metric spaces $(M, \delta)$, the second infimum is over isometric
embeddings $\phi$ and $\phi'$ of $X$ and $X'$ into $M$, $\delta_H$ is the Hausdorff distance on compact subsets
of $M$, and $\delta_P$ is the Prokhorov distance between the pushforward $\phi_*\mu$ of $\mu$ by $\phi$ and the
pushforward $\phi'_*\mu'$ of $\mu'$ by $\phi'$. Again, the definition of this metric has potential to run into
set-theoretic difficulties, but they are not terribly difficult to resolve.

Proposition 1 (Proposition 1 in [7]). The space $(\mathcal{M}_w, d_{GHP})$ is a complete separable metric
space.

An $\mathbb{R}$-tree is a complete metric space $(T, d)$ with the following properties:

• For $v, w \in T$, there exists a unique isometry $\phi_{v,w} : [0, d(v,w)] \to T$ with $\phi_{v,w}(0) = v$ and
$\phi_{v,w}(d(v,w)) = w$.

• For every continuous injective function $c : [0, 1] \to T$ such that $c(0) = v$ and $c(1) = w$,
we have $c([0,1]) = \phi_{v,w}([0, d(v,w)])$.

If $(T, d)$ is an $\mathbb{R}$-tree, every choice of root $\rho \in T$ and probability measure $\mu$ on $T$ yields
an element $(T, d, \rho, \mu)$ of $\mathcal{M}_w$. With this choice of root also comes a height function
$\mathrm{ht}(v) = d(v, \rho)$. The leaves of $T$ can then be defined as the points $v \in T$ such that $v$ is not in
$[[\rho, w[[ \, := \phi_{\rho,w}([0, \mathrm{ht}(w)))$ for any $w \in T$. The set of leaves is denoted $\mathcal{L}(T)$.

Definition 1. A continuum tree is an $\mathbb{R}$-tree $(T, d, \rho, \mu)$ with a choice of root and probability
measure such that $\mu$ is non-atomic, $\mu(\mathcal{L}(T)) = 1$, and for every non-leaf vertex $w$,

\[ \mu\{v \in T : [[\rho, v]] \cap [[\rho, w]] = [[\rho, w]]\} > 0. \]

The last condition says that there is a positive mass of leaves above every non-leaf vertex.
We will usually just refer to a continuum tree $T$, leaving the metric, root, and measure as
implicit. A continuum random tree (CRT) is an $(\mathcal{M}_w, d_{GHP})$-valued random variable that
is almost surely a continuum tree. The continuum random trees we will be interested in are
those associated with self-similar fragmentation processes.

2.4. Self-similar fragmentations. For any set $B$, let $\mathcal{P}_B$ be the set of countable partitions
of $B$, i.e. countable collections of disjoint sets whose union is $B$. For $n \in \bar{\mathbb{N}} := \mathbb{N} \cup \{\infty\}$, let
$\mathcal{P}_n := \mathcal{P}_{[n]}$. Suppose that $\pi = (\pi_1, \pi_2, \dots) \in \mathcal{P}_n$ (here and throughout we index the blocks of
$\pi$ in increasing order of their least elements), and $B \subseteq \mathbb{N}$. Define the restriction of $\pi$ to $B$,
denoted by $\pi|_B$ or $\pi \cap B$, to be the partition of $[n] \cap B$ whose elements are the blocks $\pi_i \cap B$,
$i \ge 1$. We topologize $\mathcal{P}_n$ by the metric

\[ d(\pi, \sigma) = \frac{1}{\inf\{i : \pi \cap [i] \ne \sigma \cap [i]\}}. \]

It is worth noting that this is, in fact, an ultra-metric, i.e. for $\pi^1, \pi^2, \pi^3 \in \mathcal{P}_n$, we have

\[ d(\pi^1, \pi^3) \le \max(d(\pi^1, \pi^2), d(\pi^2, \pi^3)). \]

Note that $(\mathcal{P}_n, d)$ is compact for all $n$.
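The metric above can be sketched in code for partitions of a finite set $[n]$. This is only an illustration under assumptions not in the text: partitions are represented as lists of lists of integers, and the helper names `restrict`, `dist` and `set_partitions` are hypothetical. The convention that the infimum of the empty set is infinite gives distance 0 for equal partitions.

```python
from fractions import Fraction


def restrict(partition, i):
    """Restrict a partition (iterable of blocks) to [i] = {1, ..., i}."""
    ground = set(range(1, i + 1))
    return frozenset(frozenset(b) & ground for b in partition if set(b) & ground)


def dist(pi, sigma, n):
    """d(pi, sigma) = 1 / inf{i : pi restricted to [i] != sigma restricted to [i]}."""
    for i in range(1, n + 1):
        if restrict(pi, i) != restrict(sigma, i):
            return Fraction(1, i)
    return Fraction(0)  # the restrictions agree for every i <= n


def set_partitions(n):
    """Yield every partition of {1, ..., n} as a list of lists."""
    if n == 0:
        yield []
        return
    for p in set_partitions(n - 1):
        for k in range(len(p)):
            yield p[:k] + [p[k] + [n]] + p[k + 1:]
        yield p + [[n]]
```

Checking the ultra-metric inequality over all triples of partitions of $[4]$ is a brute-force sanity check, not a proof.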

Definition 2 (Definition 3.1 in [3]). Consider two blocks $B \subseteq B' \subseteq \mathbb{N}$. Let $\pi$ be a partition
of $B$ with $\#\pi = n$ non-empty blocks ($n = \infty$ is allowed), and $\pi^{(\cdot)} = \{\pi^{(i)}, i = 1, \dots, n\}$ be
a sequence in $\mathcal{P}_{B'}$. For every integer $i$, we consider the partition of the $i$-th block $\pi_i$ of $\pi$
induced by the $i$-th term $\pi^{(i)}$ of the sequence $\pi^{(\cdot)}$, that is,

\[ \pi^{(i)}|_{\pi_i} = \left( \pi^{(i)}_j \cap \pi_i,\ j \in \mathbb{N} \right). \]

As $i$ varies in $[n]$, the collection $\{\pi^{(i)}_j \cap \pi_i : i, j \in \mathbb{N}\}$ forms a partition of $B$, which we denote
by $\mathrm{Frag}(\pi, \pi^{(\cdot)})$ and call the fragmentation of $\pi$ by $\pi^{(\cdot)}$.
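For finite blocks, the Frag operation of Definition 2 can be sketched as follows. The representation (lists of Python sets) and the function name `frag` are assumptions made for illustration; the output blocks are sorted by least element, matching the indexing convention used in this section.

```python
def frag(pi, pis):
    """Fragment the partition `pi` by the sequence of partitions `pis`.

    pi  : list of blocks (sets) forming a partition of B
    pis : list of partitions of some B' containing B, one per block of pi
    The i-th block of pi is split by intersecting it with the blocks of pis[i].
    """
    blocks = []
    for block, sub in zip(pi, pis):
        for piece in sub:
            inter = set(block) & set(piece)
            if inter:  # drop empty intersections
                blocks.append(inter)
    # Index the resulting blocks in increasing order of their least elements.
    return sorted(blocks, key=min)
```

For example, fragmenting `[{1, 2, 3}, {4, 5}]` by `[[{1, 3, 5}, {2, 4}], [{1, 2, 3, 4}, {5}]]` splits the first block into `{1, 3}` and `{2}` and the second into `{4}` and `{5}`.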

It is relatively easy to show that Frag is Lipschitz continuous in the first variable, and
continuous in an appropriate sense in the second. It is also relatively easy to show that if $\pi$
is an exchangeable partition and $\pi^{(\cdot)}$ is a sequence of independent exchangeable partitions
(also independent of $\pi$), then $\pi$ and $\mathrm{Frag}(\pi, \pi^{(\cdot)})$ are jointly exchangeable. See Chapter 3 of
[3] for both of these facts. We will use the Frag function to define the transition kernels of
our fragmentation processes.

Define

\[ S^\downarrow = \Big\{ (s_1, s_2, \dots) : s_1 \ge s_2 \ge \dots \ge 0,\ \sum_i s_i \le 1 \Big\} \]

and

\[ S_1 = \Big\{ (s_1, s_2, \dots) \in [0,1]^{\mathbb{N}} : \sum_i s_i \le 1 \Big\}, \]

and endow both with the topology they inherit as subsets of $[0, 1]^{\mathbb{N}}$ with the product topology.
Observe that $S^\downarrow$ and $S_1$ are compact. For a partition $\pi \in \mathcal{P}_\infty$, we define the asymptotic
frequency $|\pi_i|$ of the $i$'th block by

\[ |\pi_i| = \lim_{n \to \infty} \frac{\#(\pi_i \cap [n])}{n}, \]

provided this limit exists. If all of the blocks of $\pi$ have asymptotic frequencies, we define
$|\pi| \in S_1$ by $|\pi| = (|\pi_1|, |\pi_2|, \dots)$.

Definition 3 (Definition 3.3 in [3]). Let $\Pi(t)$ be an exchangeable, càdlàg $\mathcal{P}_\infty$-valued process
such that $\Pi(0) = 1_{\mathbb{N}} := (\mathbb{N}, \emptyset, \dots)$ and such that

(1) $\Pi(t)$ almost surely possesses asymptotic frequencies $|\Pi(t)|$ simultaneously for all $t \ge 0$,
and

(2) if we denote by $B_i(t)$ the block of $\Pi(t)$ which contains $i$, then the process $t \mapsto |B_i(t)|$
has right-continuous paths.

We call $\Pi$ a self-similar fragmentation process with index $\alpha \in \mathbb{R}$ if and only if, for every
$t, t' \ge 0$, the conditional distribution of $\Pi(t + t')$ given $\mathcal{F}_t$ is that of the law of $\mathrm{Frag}(\pi, \pi^{(\cdot)})$,
where $\pi = \Pi(t)$ and $\pi^{(\cdot)} = (\pi^{(i)}, i \in \mathbb{N})$ is a family of independent random partitions such
that for $i \in \mathbb{N}$, $\pi^{(i)}$ has the same distribution as $\Pi(t' |\pi_i|^\alpha)$.

The existence of these processes is non-trivial, but they do in fact exist. This makes sense
even for $\alpha < 0$ and, in this case, one can show that for any $t > 0$ the set of singletons of $\Pi(t)$
has positive asymptotic frequency. See Corollary 3.2 in [3].

One important tool for studying self-similar fragmentations is the equivalent of the
Lévy-Itô decomposition of Lévy processes. Suppose, for the moment, that $\Pi$ is a self-similar
fragmentation process with $\alpha = 0$ (these are also called homogeneous fragmentations). In
this case, it turns out that $\Pi$ is a Feller process, as is $\Pi|_{[n]}$ for every $n$. Thus it is natural to
try to identify the jump rates of these processes. For $\pi \in \mathcal{P}_n \setminus \{1_{[n]}\}$, let

\[ q_\pi = \lim_{t \to 0+} \frac{1}{t} P(\Pi|_{[n]}(t) = \pi). \]

By exchangeability, it is obvious that $q_\pi = q_{\sigma(\pi)}$ for every permutation $\sigma$ of $[n]$. Less obvious,
but still true, is that the law of $\Pi$ is determined by the jump rates $\{q_\pi : \pi \in \mathcal{P}_n \setminus \{1_{[n]}\}, n \in \mathbb{N}\}$.
Furthermore, there is a nice description of these rates. For $\pi \in \mathcal{P}_n$ and $n' \in \{n, n+1, \dots, \infty\}$,
define

\[ \mathcal{P}_{n',\pi} = \{\pi' \in \mathcal{P}_{n'} : \pi'|_{[n]} = \pi\}. \]

Proposition 2 (Propositions 3.2 and 3.3 in [3]). Suppose we have a family
$\{q_\pi : \pi \in \mathcal{P}_n \setminus \{1_{[n]}\}, n \in \mathbb{N}\}$. This family is the family of jump rates of some homogeneous
fragmentation $\Pi$ if and only if there is an exchangeable measure $\mu$ on $\mathcal{P}_\infty$ satisfying

(1) $\mu(\{1_{\mathbb{N}}\}) = 0$ and,

(2) $\mu(\{\pi \in \mathcal{P}_\infty : \pi|_{[n]} \ne 1_{[n]}\}) < \infty$ for every $n \ge 2$,

such that $\mu(\mathcal{P}_{\infty,\pi}) = q_\pi$. Furthermore, this correspondence is bijective, and we call $\mu$ the
splitting rate of $\Pi$.

We are after a Lévy-Itô decomposition of $\mu$. For $n \in \mathbb{N}$, let $\epsilon^{(n)}$ be the partition of $\mathbb{N}$ with
exactly two blocks, $\{n\}$ and $\mathbb{N} \setminus \{n\}$, and define

\[ \epsilon = \sum_{n=1}^{\infty} \delta_{\epsilon^{(n)}}. \]

For a measure $\nu$ on $S^\downarrow$ such that $\nu(\{\mathbf{1}\}) = 0$ and $\int_{S^\downarrow} (1 - s_1)\, \nu(ds) < \infty$, define a measure $\rho_\nu$
on $\mathcal{P}_\infty$ by

\[ \rho_\nu(\cdot) = \int_{S^\downarrow} \rho_s(\cdot)\, \nu(ds), \]

where $\rho_s$ denotes the law of the paintbox partition based on $s$ (see [3]).

Theorem 2 (Theorem 3.1 in [3]). Let $\mu$ be the splitting rate of a homogeneous fragmentation.
Then there exists a unique $c \ge 0$ and a unique measure $\nu$ on $S^\downarrow$ with $\nu(\{\mathbf{1}\}) = 0$ and
$\int_{S^\downarrow} (1 - s_1)\, \nu(ds) < \infty$, such that

\[ \mu = c\epsilon + \rho_\nu. \]

The interpretation of this is that $c$ is the erosion coefficient, i.e. the rate at which mass is
lost continuously, and $\nu$ is the dislocation measure, i.e. it measures the rate of macroscopic
fragmentation. From this theorem, it is clear that every homogeneous fragmentation process
is characterized by the pair $(c, \nu)$. Given a homogeneous fragmentation $\Pi^0(t)$ with parameters
$(0, \nu)$, and $\alpha < 0$, we can construct an $\alpha$-self-similar fragmentation with parameters $(\alpha, 0, \nu)$
by a time change. Let $\pi^i(t)$ be the block of $\Pi^0$ that contains $i$ at time $t$ and define

\[ T_i(t) = \inf\Big\{ u \ge 0 : \int_0^u |\pi^i(r)|^{-\alpha}\, dr > t \Big\}. \]

For $t \ge 0$, let $\Pi(t)$ be the partition such that $i, j$ are in the same block of $\Pi(t)$ if and only
if they are in the same block of $\Pi^0(T_i(t))$. Then $(\Pi(t), t \ge 0)$ is a self-similar fragmentation
with characteristics $(\alpha, 0, \nu)$. See [2] for details.

We will need trees associated to fragmentations with characteristics $(\alpha, 0, \nu)$, where $\alpha < 0$
and $\nu(\sum_i s_i < 1) = 0$. What these assumptions tell us is that there is no continuous
loss of mass due to erosion ($c = 0$), mass is not lost during macroscopic fragmentations
($\nu(\sum_i s_i < 1) = 0$), and the fragmentation eventually becomes the partition into singletons
(Proposition 2 in [2]; the rate of convergence is given in Proposition 14 in [5]). Henceforth,
we let $\Pi$ be such a self-similar fragmentation.

The tree associated to a fragmentation process $\Pi$ is a continuum random tree that keeps
track of when blocks split apart and the sizes of the resulting blocks. For a continuum tree
$(T, \mu)$ and $t \ge 0$, let $T_1(t), T_2(t), \dots$ be the tree components of $\{v \in T : \mathrm{ht}(v) > t\}$, ranked
in decreasing order of $\mu$-mass. We call $(T, \mu)$ self-similar with index $\alpha < 0$ if for every $t \ge 0$,
conditionally on $(\mu(T_i(t)), i \ge 1)$, $(T_i(t), i \ge 1)$ has the same law as $(\mu(T_i(t))^{-\alpha} T^{(i)}, i \ge 1)$,
where the $T^{(i)}$'s are independent copies of $T$.

The following summarizes the parts of Theorem 1 and Lemma 5 in [6] that we will need.

Theorem 3. Let $\Pi$ be an $(\alpha, 0, \nu)$-self-similar fragmentation with $\alpha < 0$ and $\nu$ as above and
let $F := |\Pi|^\downarrow$ be its ranked sequence of asymptotic frequencies. There exists an $\alpha$-self-similar
CRT $(T_{-\alpha,\nu}, \mu_{-\alpha,\nu})$ such that, writing $F'(t)$ for the decreasing sequence of masses of the
connected components of $\{v \in T_{-\alpha,\nu} : \mathrm{ht}(v) > t\}$, the process $(F'(t), t \ge 0)$ has the same law
as $F$. Furthermore, $T_{-\alpha,\nu}$ is a.s. compact.

The choice of where to put negative signs in the notation in the above theorem is to
conform with the notation of [7]. The Brownian CRT is the $-1/2$-self-similar random tree
with dislocation measure $\nu_2$ given by

\[ \int_{S^\downarrow} \nu_2(ds)\, f(s) = \int_{1/2}^{1} \sqrt{\frac{2}{\pi s_1^3 (1 - s_1)^3}}\; ds_1\, f(s_1, 1 - s_1, 0, 0, \dots). \]

Since we will always have $c = 0$, we will drop it and, for a measure $\nu$ satisfying the above
conditions and $\gamma > 0$, we refer to $(-\gamma, \nu)$ as a fragmentation pair, which is associated to a
$(-\gamma, \nu)$-self-similar fragmentation.

2.5. Convergence of Markov branching trees. We first recall some of the main results
of [7]. Let $A \subseteq \mathbb{Z}^+$ contain 0 and let $(q_n)$ be a compatible sequence of probability measures
satisfying the conditions of Section 2.2. Define $\bar{q}_n$ to be the push forward of $q_n$ onto $S^\downarrow$ by

\[ \lambda \mapsto \left( \frac{\lambda_1}{\sum_i \lambda_i}, \frac{\lambda_2}{\sum_i \lambda_i}, \dots, \frac{\lambda_{p(\lambda)}}{\sum_i \lambda_i}, 0, 0, \dots \right). \]

Theorem 4 (Theorems 1 and 2 in [7]). Suppose that $A = \{0\}$ or $A = \mathbb{Z}^+$. Further suppose
that there is a fragmentation pair $(-\gamma, \nu)$ with $0 < \gamma < 1$ and a function $\ell : (0, \infty) \to (0, \infty)$,
slowly varying at $\infty$ (or $\gamma = 1$ and $\ell(n) \to 0$) such that, in the sense of weak convergence of
finite measures on $S^\downarrow$, we have

\[ n^\gamma \ell(n) (1 - s_1)\, \bar{q}_n(ds) \to (1 - s_1)\, \nu(ds). \]

Let $T_n$ have law $P^q_n$ and view $T_n$ as a random element of $\mathcal{M}_w$ with the graph distance and
the uniform probability measure $\mu_{\partial_A T_n}$ on $\partial_A T_n = \{v \in T_n : \deg v \in A\}$. Then we have the
convergence in distribution

\[ \frac{1}{n^\gamma \ell(n)}\, T_n \xrightarrow{d} T_{\gamma,\nu} \]

with respect to the rooted Gromov-Hausdorff-Prokhorov topology.

The case $A = \{0\}$ is a special case of Theorem 1 in [7] and the case $A = \mathbb{Z}^+$
is Theorem 2 in the same paper. The case $A = \mathbb{Z}^+$ is proved by reduction to the $A = \{0\}$
case. We extend this to the case of general $A$ containing 0.

Theorem 5. The conclusions of Theorem 4 are valid if the only assumption on $A \subseteq \mathbb{Z}^+$ is
that $0 \in A$.

As argued at the start of Section 4 in [7], we may assume that $q_1(\emptyset) = 1$. This is because
each leaf is connected to the rest of the tree by a stalk of vertices with out-degree one and
geometric length. Setting $q_1(\emptyset) = 1$ collapses these stalks to length one. Since these stalks are
independent of one another, with probability approaching one, this costs $\log(n)$ in the
Gromov-Hausdorff-Prokhorov metric. This is negligible since we are scaling by $(n^\gamma \ell(n))^{-1}$.
Let $t$ be a rooted unordered tree with $n$ vertices whose out-degree is in $A$ and let $t^\circ$ be the
tree obtained from $t$ by attaching a leaf to every non-leaf vertex of $t$ whose out-degree is in
$A$.
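The map $t \mapsto t^\circ$ can be illustrated with a small sketch. As an assumption made purely for illustration, a tree is encoded as a nested tuple of its children's subtrees (so a leaf is the empty tuple); the names `count_A` and `add_leaves` are hypothetical. The point of the construction is that the leaves of $t^\circ$ correspond to the vertices of $t$ with out-degree in $A$.

```python
def count_A(t, A):
    """Number of vertices of t whose out-degree lies in A."""
    return int(len(t) in A) + sum(count_A(child, A) for child in t)


def add_leaves(t, A):
    """Return t°: attach one new leaf to every non-leaf vertex with out-degree in A."""
    children = tuple(add_leaves(child, A) for child in t)
    if len(t) > 0 and len(t) in A:
        children += ((),)  # the extra leaf
    return children


# Root with three children: a leaf, a vertex with one leaf child,
# and a vertex with two leaf children.
example = ((), ((),), ((), ()))
```

With `A = {0, 1}` the example has five vertices with out-degree in `A`, and `add_leaves(example, A)` has five leaves.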

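Define the inclusion $\iota : \mathcal{P}_{n-1} \to \mathcal{P}_n$ by $\iota(\lambda) = (\lambda, 1)$. We now define a sequence $q^\circ$ of
probability measures on $\mathcal{P}_n$. Define $q^\circ_1(\emptyset) = 1$ and for $n \ge 2$,

\[ q^\circ_n(\lambda) = \begin{cases}
q_n(\lambda) & \text{if } \lambda \in \mathcal{P}_n^A \setminus \iota(\mathcal{P}_n^A \cap \mathcal{P}_{n-1}), \\
q_n(\lambda) + q_n(\lambda') & \text{if } \lambda \in \mathcal{P}_n^A \text{ and } \lambda = \iota(\lambda') \text{ for some } \lambda' \in \mathcal{P}_n^A \cap \mathcal{P}_{n-1}, \\
q_n(\lambda') & \text{if } \lambda \notin \mathcal{P}_n^A \text{ and } \lambda = \iota(\lambda') \text{ for some } \lambda' \in \mathcal{P}_n^A \cap \mathcal{P}_{n-1}, \\
0 & \text{otherwise.}
\end{cases} \]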



Lemma 1. If $T_n$ has distribution $P^q_n$ then $T_n^\circ$ has distribution $P^{q^\circ}_n$.

Proof. We prove this by induction. The result is clear for $n = 1$. For $n \ge 2$, we condition on
the root partition. Indeed, since in both $T_n^\circ$ and a tree with law $P^{q^\circ}_n$ the subtrees attached
to the root are independent given the root partition, by induction (and a little care about
when the partition at the root is $(n)$), we need only check that the laws of the partitions at
the root agree. This, however, is immediate from the construction of $q^\circ$. □

Therefore Theorem 5 is an immediate consequence of the following lemma.



Lemma 2. If

\[ n^\gamma \ell(n) (1 - s_1)\, \bar{q}_n(ds) \to (1 - s_1)\, \nu(ds), \]

then

\[ n^\gamma \ell(n) (1 - s_1)\, \bar{q}^\circ_n(ds) \to (1 - s_1)\, \nu(ds). \]



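Proof. Let $f : S^\downarrow \to \mathbb{R}$ be Lipschitz continuous (with respect to the uniform norm) with
both the uniform norm and Lipschitz constant bounded by $K$. Observe that for $\lambda \in \mathcal{P}_n$, the
sequences $\lambda/n$ and $\iota(\lambda)/(n+1)$ differ by at most $1/(n+1)$ in every coordinate, so

\[ \left| f\!\left(\frac{\lambda}{n}\right) - f\!\left(\frac{\iota(\lambda)}{n+1}\right) \right| \le \frac{K}{n+1}. \]

Letting $g(s) = (1 - s_1) f(s)$, we have

\[ \left| g\!\left(\frac{\lambda}{n}\right) - g\!\left(\frac{\iota(\lambda)}{n+1}\right) \right|
\le \frac{\lambda_1}{n(n+1)} \left| f\!\left(\frac{\lambda}{n}\right) \right| + \left(1 - \frac{\lambda_1}{n+1}\right) \left| f\!\left(\frac{\lambda}{n}\right) - f\!\left(\frac{\iota(\lambda)}{n+1}\right) \right|
\le \frac{K}{n+1} + \frac{K}{n+1} = \frac{2K}{n+1}. \]

The measures $\bar{q}_n$ and $\bar{q}^\circ_n$ differ only through the partitions $\lambda' \in \mathcal{P}_n^A \cap \mathcal{P}_{n-1}$, which are
replaced by $\iota(\lambda') \in \mathcal{P}_n$. Applying the estimate above with $n - 1$ in place of $n$ gives, for $n \ge 2$,

\[ \left| g\!\left(\frac{\lambda'}{n-1}\right) - g\!\left(\frac{\iota(\lambda')}{n}\right) \right| \le \frac{2K}{n} \le \frac{3K}{n+1}, \]

and hence the integrals of $g$ against $\bar{q}_n$ and $\bar{q}^\circ_n$ differ by at most $3K/(n+1)$.
Multiplying by $n^\gamma \ell(n)$, we see that this upper bound goes to 0 and the result follows. □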
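Proof of Theorem 5. Note that, if $a > 0$, then $d_{GHP}(at, at^\circ) \le a$. Consequently

\[ d_{GHP}\left( (n^\gamma \ell(n))^{-1} T_n,\ (n^\gamma \ell(n))^{-1} T_n^\circ \right) \to 0. \]

Since $(n^\gamma \ell(n))^{-1} T_n^\circ \to T_{\gamma,\nu}$ in distribution by Lemma 2 and Theorem 4,
$(n^\gamma \ell(n))^{-1} T_n \to T_{\gamma,\nu}$ in distribution as well. □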



3. Galton-Watson trees

Let $\xi = (\xi_i)_{i \ge 0}$ be a probability distribution with mean less than or equal to 1, and assume
that $\xi_1 < 1$. A Galton-Watson tree with offspring distribution $\xi$ is a random element $T$ of
$\mathcal{T}$ with law

\[ GW_\xi(t) = P(T = t) = \prod_{v \in t} \xi_{\deg(v)}. \]

The fact that $\xi$ has mean less than or equal to 1 implies that the right hand side defines an
honest probability distribution on $\mathcal{T}$.
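The product formula for the law of a Galton-Watson tree is easy to evaluate directly. The sketch below assumes, for illustration only, that an ordered tree is encoded as a nested tuple of its children's subtrees and that the offspring distribution is a finite list `xi` with `xi[i]` the probability of `i` children; the name `gw_prob` is hypothetical.

```python
from fractions import Fraction


def gw_prob(t, xi):
    """Probability that a Galton-Watson tree with offspring law xi equals t.

    Implements the product over vertices v of xi[deg(v)], where deg(v)
    is the number of children of v.
    """
    deg = len(t)
    p = xi[deg] if deg < len(xi) else Fraction(0)
    for child in t:
        p *= gw_prob(child, xi)
    return p


# Critical binary offspring law: 0 or 2 children with probability 1/2 each.
binary = [Fraction(1, 2), Fraction(0), Fraction(1, 2)]
```

For instance, the tree consisting of a root with two leaf children has probability $\xi_2 \xi_0^2 = 1/8$ under the critical binary law.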

3.1. Otter-Dwass type formulae. In this section we develop a transformation of rooted
ordered trees that takes Galton-Watson trees to Galton-Watson trees. This transformation is
motivated by the observation that the number of leaves in a Galton-Watson tree is distributed
like the progeny of a Galton-Watson tree with a related offspring distribution. This simple
observation was first made in [11]. Let $\xi = (\xi_i)_{i \ge 0}$ be a probability distribution with mean
less than or equal to 1, and assume that $\xi_1 < 1$. Let $T$ be a Galton-Watson tree with
offspring distribution $\xi$ and let

\[ C(z) = \sum_{i=1}^{\infty} P(\#_{\{0\}} T = i)\, z^i \]

be the probability generating function of the number of leaves of $T$. Furthermore, let

\[ \phi(z) = \sum_{i=1}^{\infty} \xi_i z^{i-1}. \]

Decomposing by the root degree, we see that $C(z)$ satisfies the functional equation

\[ C(z) = \xi_0 z + C(z)\, \phi(C(z)). \]

Solving for $C(z)$ yields

\[ C(z) = \frac{\xi_0 z}{1 - \phi(C(z))}. \tag{3.1} \]

Define

\[ \theta(z) = \frac{\xi_0}{1 - \phi(z)}. \tag{3.2} \]

Observe that $\theta$ has nonnegative coefficients, $[z^0]\theta(z) = \xi_0/(1 - \xi_1)$ and $\theta(1) = 1$. Thus the
coefficients of $\theta$ are a probability distribution, call it $\zeta = (\zeta_i)_{i \ge 0}$.
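The coefficients of $\theta$ can be computed from $\xi$ by a simple recursion. The sketch below assumes that $\theta(z)(1 - \phi(z)) = \xi_0$ with $\phi(z) = \sum_{i \ge 1} \xi_i z^{i-1}$, which is one reading of Equation (3.2) consistent with the stated properties $[z^0]\theta(z) = \xi_0/(1 - \xi_1)$ and $\theta(1) = 1$. The function name `zeta_from_xi` and the list encoding of $\xi$ are hypothetical.

```python
from fractions import Fraction


def zeta_from_xi(xi, n_terms):
    """First n_terms coefficients of theta(z) = xi_0 / (1 - phi(z)).

    Comparing coefficients of z^k in theta(z) * (1 - phi(z)) = xi_0 gives
    theta_k * (1 - xi_1) = [k == 0] * xi_0 + sum_{j=1..k} xi_{j+1} * theta_{k-j}.
    """
    def get(i):
        return xi[i] if i < len(xi) else Fraction(0)

    theta = []
    for k in range(n_terms):
        acc = get(0) if k == 0 else Fraction(0)
        acc += sum(get(j + 1) * theta[k - j] for j in range(1, k + 1))
        theta.append(acc / (1 - get(1)))
    return theta
```

For the critical binary law $\xi = (1/2, 0, 1/2)$ this gives $\theta(z) = (1/2)/(1 - z/2)$, so $\zeta_k = 2^{-(k+1)}$, a geometric distribution with mean 1.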

Proposition 3. Let $T$ be a Galton-Watson tree with offspring distribution $\xi$ and let $T'$ be a
Galton-Watson tree with offspring distribution $\zeta$, where $\xi$ and $\zeta$ are related as above. Then
for all $k \ge 1$, $P(\#_{\{0\}} T = k) = P(\#_{\mathbb{Z}^+} T' = k)$. Also, $T'$ is critical (subcritical) if and only if
$T$ is critical (subcritical).

Proof. The computations above show that the probability generating functions for $\#_{\{0\}} T$
and $\#_{\mathbb{Z}^+} T'$ satisfy the same functional equation, and the Lagrange inversion formula implies
they have the same coefficients. The criticality claims follow from Equation (3.2), which
can also be used to obtain higher moments of $\zeta$. □

Corollary 1. Let $\mathcal{F}_n$ be an ordered forest of $n$ independent Galton-Watson trees all with
offspring distribution $\xi$. Let $\zeta$ be related to $\xi$ as in Proposition 3. Let $(X_i)_{i \ge 1}$ be an i.i.d.
sequence of $\zeta$ distributed random variables and let $S_k = \sum_{i=1}^{k} (X_i - 1)$. Let $\#_{\{0\}} \mathcal{F}_n$ denote
the number of leaves in $\mathcal{F}_n$. Then for $1 \le n \le k$,

\[ P(\#_{\{0\}} \mathcal{F}_n = k) = \frac{n}{k}\, P(S_k = -n). \]

Proof. This follows immediately from Proposition 3 and the Otter-Dwass formula (see [12]). □
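The identity in Corollary 1 can be checked with exact arithmetic for small cases. The sketch below is an illustration under assumptions: it takes the identity in the form $P(\#_{\{0\}} \mathcal{F}_n = k) = (n/k) P(S_k = -n)$, uses truncated power series stored as lists of `Fraction`, and reads $\theta$ as $\xi_0/(1 - \phi)$ with $\phi(z) = \sum_{i \ge 1} \xi_i z^{i-1}$. All function names are hypothetical.

```python
from fractions import Fraction


def poly_mul(a, b, K):
    """Product of two power series, truncated after degree K."""
    out = [Fraction(0)] * (K + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j > K:
                break
            out[i + j] += ai * bj
    return out


def poly_pow(a, m, K):
    """m-th power of a power series, truncated after degree K."""
    out = [Fraction(1)] + [Fraction(0)] * K
    for _ in range(m):
        out = poly_mul(out, a, K)
    return out


def leaf_pgf(xi, K):
    """Leaf-count generating function C(z) of one tree, up to degree K (K >= 1).

    Iterates C = (xi_0 z + sum_{i>=2} xi_i C^i) / (1 - xi_1); each pass
    fixes at least one more coefficient, so K passes suffice.
    """
    C = [Fraction(0)] * (K + 1)
    for _ in range(K):
        new = [Fraction(0)] * (K + 1)
        new[1] = xi[0]
        for i in range(2, len(xi)):
            new = [u + xi[i] * v for u, v in zip(new, poly_pow(C, i, K))]
        C = [c / (1 - xi[1]) for c in new]
    return C


def theta_coeffs(xi, K):
    """Coefficients of theta(z) = xi_0 / (1 - phi(z)) up to degree K."""
    def get(i):
        return xi[i] if i < len(xi) else Fraction(0)
    theta = []
    for k in range(K + 1):
        acc = get(0) if k == 0 else Fraction(0)
        acc += sum(get(j + 1) * theta[k - j] for j in range(1, k + 1))
        theta.append(acc / (1 - get(1)))
    return theta


def both_sides(xi, n, k):
    """Return (P(forest of n trees has k leaves), (n/k) P(S_k = -n))."""
    lhs = poly_pow(leaf_pgf(xi, k), n, k)[k]
    # S_k = -n exactly when X_1 + ... + X_k = k - n.
    rhs = Fraction(n, k) * poly_pow(theta_coeffs(xi, k), k, k)[k - n]
    return lhs, rhs
```

This is a finite numerical check of the formula for particular offspring laws, not a substitute for the proof.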

This relationship between $T$ and $T'$ can also be proved in a more probabilistic fashion.
Indeed, by taking a more probabilistic approach we can get a more general result that
includes the results in [11] as a special case. To prove the result in full generality, it is more
convenient to work with the depth-first queue of $T$ than with $T$ itself.

For $x \in \mathbb{Z}^{\mathbb{N}}$, let $\tau_{-1}(x) = \inf\{n : \sum_{i=1}^{n} x_i = -1\}$. Let $\mathcal{D}$ be the set of sequences
of excursion increments in $\mathbb{Z}^{\mathbb{N}}$ that are bounded below by $-1$. Formally,

\[ \mathcal{D} = \left\{ x \in \mathbb{Z}^{\mathbb{N}} : x_i \ge -1 \text{ for } i \ge 1,\ x_i = 0 \text{ for } i > \tau_{-1}(x),\ \tau_{-1}(x) < \infty \right\}. \]

For $t \in \mathcal{T}$ with $n$ vertices, index the vertices $v_1, \dots, v_n$ of $t$ from 1 to $n$ by order of
appearance on the depth-first walk of $t$. Define $DQ(t) = (DQ_k(t))_{k \ge 1}$ by $DQ_k(t) = \deg v_k - 1$
for $k \le n$ and 0 for $k > n$, which are the increments of the depth-first queue of $t$. Note that
$DQ(t) \in \mathcal{D}$. Furthermore $t \mapsto DQ(t)$ is a bijection from $\mathcal{T}$ to $\mathcal{D}$ (see e.g. [12]).
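The depth-first queue encoding and its inverse can be sketched as follows, keeping only the first $n$ increments (the trailing zeros are dropped). The nested-tuple encoding of ordered trees and the names `dq` and `tree_from_dq` are assumptions made for illustration.

```python
def dq(t):
    """Increments deg(v_k) - 1 of the depth-first queue, in depth-first order."""
    out = []
    stack = [t]
    while stack:
        v = stack.pop()
        out.append(len(v) - 1)
        stack.extend(reversed(v))  # visit the children left to right
    return out


def tree_from_dq(x):
    """Rebuild the ordered tree from its depth-first queue increments."""
    it = iter(x)

    def build():
        deg = next(it) + 1
        return tuple(build() for _ in range(deg))

    return build()


# Root with three children; the middle child has two leaf children.
example = ((), ((), ()), ())
```

For this example `dq(example)` is `[2, -1, 1, -1, -1, -1]`, whose partial sums `2, 1, 2, 1, 0, -1` first hit $-1$ at the last vertex.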

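Let $\pi_n$ be the projection onto the $n$'th coordinate of $\mathbb{Z}^{\mathbb{N}}$ and let $\mathcal{F}_n = \sigma(\pi_k, k \le n)$. Let
$\theta_n$ be the shift $(\theta_n x)(i) = x(n + i)$. Let $N'$ be a stopping time with respect to $(\mathcal{F}_n)$. Let
$N^0 = 0$ and for $i \ge 1$ define $N^i = N^{i-1} + (N' \wedge \tau_{-1}) \circ \theta_{N^{i-1}}$. Define $\tilde{x}$ by

\[ \tilde{x}(k) = \begin{cases} \sum_{i = N^{k-1}+1}^{N^k} x(i) & \text{if } N^k < \infty, \\ 0 & \text{if } N^k = \infty. \end{cases} \]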
Proposition 4. If $x \in \mathcal{D}$, then $\tilde{x} \in \mathcal{D}$.

Proof. The only non-trivial part is to see that for each $x \in \mathcal{D}$ there exists $k$ such that
$N^k = \tau_{-1}$. Clearly $N^1 \le \tau_{-1}$. Let $k = \max\{i : N^i \le \tau_{-1}\}$. Suppose, for the sake of
contradiction, that $N^k < \tau_{-1}$. We then have that $\sum_{i=1}^{N^k} x(i) \ge 0$, so $\sum_{i=N^k+1}^{\tau_{-1}} x(i) \le -1$, so
$N^{k+1} \le \tau_{-1}$, which is our contradiction. □

Combining these ideas, we obtain the following theorem. 

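Theorem 6. Let $\xi$ be a probability distribution on $\mathbb{Z}^+$ with $0 < \xi_0 < 1$. Suppose that $T$
is a Galton-Watson tree with offspring distribution $\xi$. Let $N'$ be a stopping time and
let $\tilde{T}$ be the tree determined by $\widetilde{DQ(T)}$ by the bijection above. Let $X = (X_1, X_2, \dots)$ be a
vector with i.i.d. entries distributed like $P(X_i = k) = \xi_{k+1}$. Then $\tilde{T}$ is a Galton-Watson tree
whose offspring distribution is the law of $1 + \tilde{X}_1 = 1 + \sum_{i=1}^{N' \wedge \tau_{-1}} X_i$. Furthermore, if
$E|X_1| < \infty$ and $E(N' \wedge \tau_{-1}) < \infty$ then

\[ E\Big( 1 + \sum_{i=1}^{N' \wedge \tau_{-1}} X_i \Big) = 1 + E X_1\, E(N' \wedge \tau_{-1}), \]

and if, additionally, $E X_1 = 0$ (i.e. $T$ is critical) and $\mathrm{Var}(X_1) = \sigma^2 < \infty$, then

\[ \mathrm{Var}\Big( 1 + \sum_{i=1}^{N' \wedge \tau_{-1}} X_i \Big) = \sigma^2 E(N' \wedge \tau_{-1}). \]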




Proof. Define $R(X)$ to be the vector with $R(X)_k = X_k 1(k \le \tau_{-1}(X))$. It is well known
that $DQ(T) =_d R(X)$ and that the vectors $\{(X_{N^i+1}, \dots, X_{N^{i+1}})\}_{i \ge 0}$ are i.i.d. Consequently
$\tilde{X}$ is the vector of increments of a random walk with jump distribution given by the law of
$\sum_{i=1}^{N^1} X_i$. Observing that $\widetilde{DQ(T)} =_d \widetilde{R(X)} = R(\tilde{X})$ shows that $\tilde{T}$ is a Galton-Watson tree
with the appropriate offspring distribution. The last claims follow from Wald's equations. □

Let us give a specific example of how the general theorem above may be applied. For a
nonempty subset $A$ of $\mathbb{Z}^+$ let $\mathcal{D}_{A,n}$ be the set of paths in $\mathcal{D}$ with exactly $n$ terms no later than
$\tau_{-1}(x)$ in $A - 1$. When $A = \mathbb{Z}^+$, we just write $\mathcal{D}_n$. Note that for every $n \ge 1$, $t \mapsto DQ(t)$
restricts to a bijection from $\mathcal{T}_{A,n}$ to $\mathcal{D}_{A,n}$. We obtain the following plethora of Otter-Dwass
type formulae.

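Corollary 2. Fix $A \subseteq \mathbb{Z}^+$ such that $0 \in A$ and define $N'(x) = \inf\{i : x_i + 1 \in A\}$. Define
$T$, $\tilde{T}$, and $X$ as in Theorem 6 and let $\tilde{X}_1, \tilde{X}_2, \dots$ be i.i.d. distributed like $1 + \sum_{i=1}^{N'} X_i$.
Then

\[ P(\#_A T = n) = P(\#_{\mathbb{Z}^+} \tilde{T} = n) = \frac{1}{n}\, P\Big( \sum_{i=1}^{n} (\tilde{X}_i - 1) = -1 \Big). \]

The corresponding result for forests also holds. Furthermore, if $T$ is critical with variance
$0 < \sigma^2 < \infty$, then $\mathrm{Var}(\tilde{X}_1) = \sigma^2/\xi(A)$.

Proof. This follows from the observation that, with this $N'$, $x \in \mathcal{D}_{A,n}$ if and only if $\tilde{x} \in \mathcal{D}_n$.
The formula for the variance follows from the fact that $N'$ is geometric with parameter
$\xi(A)$. □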
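For the stopping time of Corollary 2, the map $x \mapsto \tilde{x}$ sums the increments of $x$ over blocks that end at the vertices with out-degree in $A$. The sketch below works on the finite list of increments of a single tree, again encoded as a nested tuple; because the last vertex in depth-first order is a leaf and $0 \in A$, the final block is always closed. The function names are hypothetical.

```python
def dq(t):
    """Increments deg(v_k) - 1 of the depth-first queue, in depth-first order."""
    out, stack = [], [t]
    while stack:
        v = stack.pop()
        out.append(len(v) - 1)
        stack.extend(reversed(v))
    return out


def tree_from_dq(x):
    """Rebuild the ordered tree from its depth-first queue increments."""
    it = iter(x)

    def build():
        return tuple(build() for _ in range(next(it) + 1))

    return build()


def count_A(t, A):
    """Number of vertices of t whose out-degree lies in A."""
    return int(len(t) in A) + sum(count_A(c, A) for c in t)


def transform(x, A):
    """Block sums of x, each block ending at an index i with x_i + 1 in A."""
    out, acc = [], 0
    for step in x:
        acc += step
        if step + 1 in A:  # this vertex has out-degree in A: close the block
            out.append(acc)
            acc = 0
    return out


example = ((), ((), ()), ())
```

With `A = {0, 2}` the increments `[2, -1, 1, -1, -1, -1]` become `[1, 1, -1, -1, -1]`, which decodes to a tree with five vertices, matching the five vertices of `example` with out-degree in `A`.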

We note that, in the context of Corollary 2, the same construction can be done directly on
the trees without first passing to the depth-first queue, though setting up the formalism for
the proof and the proof itself are slightly more involved. The idea is a lifeline construction.
You proceed around the tree in the order of the depth-first walk and when you encounter a
vertex whose degree is in $A$ you label the edges and vertices on the path from the vertex to
the root that are not yet labeled by that vertex. This labeled path can be considered the
lifeline of the vertex. A new tree is constructed by letting the root be the first vertex
encountered whose degree is in $A$ and attaching vertices whose degree is in $A$ to the earliest
vertex whose lifeline touches its own. Going through the details of this helps make this
transformation more concrete, so the case of $A = \{0\}$ is included below.

Suppose that $t \in \mathcal{T}_{\{0\},n}$ and label the leaves by the order they appear in the depth-first
walk of $t$. We will now color all of the edges of $t$. Color every edge on the path from leaf 1
to the root with 1. Continuing in increasing order of their labels, color all edges on the path
from leaf $i$ to the root that are not colored with an element of $\{1, 2, \dots, i-1\}$ with $i$, until
all edges are colored. Note that for any $1 \le k \le n$ the subtree spanned by leaves $\{1, \dots, k\}$
is colored by $\{1, \dots, k\}$ and an edge is colored by an element in $\{1, \dots, k\}$ if and only if it is
in this subtree. Furthermore, the path from leaf $k$ to any edge colored $k$ contains only edges
colored $k$. See Figure 1 for an example of such a coloring. Call two edges of $t$ coincident if
they share a common vertex.

Lemma 3. If $t$ is colored as above and $2 \le j \le n$, then there is exactly one edge colored $j$
that is coincident to an edge with a smaller color.

Proof. First we show existence. Consider the path from leaf $j$ to the root. Let $e$ be the
last edge on this path that is not contained in the subtree spanned by leaves $\{1, \dots, j-1\}$.
By construction this edge is colored $j$ and is coincident to an edge colored by an element of
$\{1, \dots, j-1\}$.

To see uniqueness, suppose that $f$ is an edge with the desired properties. Then $f$ is on
the path from $j$ to the root and $f$ is coincident to an edge in the subtree spanned by leaves
$\{1, \dots, j-1\}$. If $f$ contains the root, then $f$ is the last edge on the path from $j$ to the root
that is colored $j$, i.e. $f = e$. Otherwise, after $f$, we finish the path from $j$ to the root within
this subtree. Hence $f$ is the last edge on the path from $j$ to the root that is colored $j$ and
again $f = e$. □

With $t$ colored as above we define a rooted plane tree with $n$ vertices, called the life-line
tree and denoted $\tilde t$, as follows. The vertex set of $\tilde t$ is $\{1, 2, \dots, n\}$, and 1 is the root.
Furthermore, if $i < j$, then $i$ is adjacent to $j$ if $i$ is the smallest number such that there exist
coincident edges $e_1, e_2$ in $t$ with $e_1$ colored $i$ and $e_2$ colored $j$. Finally, the children of a
vertex are ordered by the appearance of the corresponding leaves in the depth-first search
of $t$. See Figure 1 for an example of this map.
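The definition of the life-line tree can be turned into a direct, if inefficient, computation. The sketch below is an illustration only and rests on assumptions that are not in the paper: trees are nested lists (a leaf is `[]`), and the names `color_edges` and `life_line_tree` are hypothetical. It colors the edges, then, for every leaf label j > 1, scans all pairs of coincident edges and attaches j to the smallest color that touches color j.

```python
from itertools import combinations


def color_edges(tree):
    """Color each edge (parent_id, child_id) by the first leaf above the child."""
    colors, next_id, next_leaf = {}, [0], [1]

    def visit(subtree):
        node_id = next_id[0]
        next_id[0] += 1
        if not subtree:
            label = next_leaf[0]
            next_leaf[0] += 1
            return node_id, label
        first = None
        for child in subtree:
            child_id, child_first = visit(child)
            colors[(node_id, child_id)] = child_first
            if first is None:
                first = child_first
        return node_id, first

    visit(tree)
    return colors


def life_line_tree(tree):
    """Return the life-line tree as {vertex: ordered list of children}."""
    colors = color_edges(tree)
    n = max(colors.values(), default=1)      # number of leaves
    parent = {}
    for (e1, c1), (e2, c2) in combinations(colors.items(), 2):
        # Two edges are coincident when they share an endpoint.
        if c1 != c2 and set(e1) & set(e2):
            low, high = min(c1, c2), max(c1, c2)
            # Keep the smallest color that touches color `high`.
            parent[high] = min(parent.get(high, low), low)
    children = {i: [] for i in range(1, n + 1)}
    for j in sorted(parent):                 # increasing labels = depth-first order
        children[parent[j]].append(j)
    return children
```

For instance, `life_line_tree([[], [[], []]])` gives the path 1, 2, 3: leaf 2 touches the lifeline of leaf 1 at the root, and leaf 3 touches only the lifeline of leaf 2.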



Lemma 4. The life-line tree is a tree. 

Proof. We must show that $\tilde t$ is connected and acyclic. Suppose that $\tilde t$ has at least two
components. Let $j$ be the smallest vertex not in the same component as 1. By Lemma 3,
there exist $1 \le i < j$ and coincident edges $e_1, e_2$ in $t$ colored $i$ and $j$ respectively. Taking
$i$ to be the smallest such index, $i$ is adjacent to $j$. Since $i < j$, $i$ lies in the component of 1,
a contradiction.

Suppose that $\tilde t$ contains a cycle. Let $j$ be the largest vertex in this cycle. Then $j$ is
adjacent to two smaller vertices, contradicting our definition of $\tilde t$. □

Let $(v_1, \dots, v_k)$ be the list of vertices (ordered by order of appearance in the depth-first
walk of $t$) in $t$ that are children of vertices on the path from 1 to the root, but not actually
on that path themselves. Let $t_{v_j}$ be the plane subtree of $t$ above $v_j$. It is then easily verified
that $\tilde t$ is obtained by joining the trees $(\tilde t_{v_1}, \dots, \tilde t_{v_k})$ to a common root with their natural
order (and renaming the vertices as they appear in the depth-first walk).




Figure 1. A colored tree and its life-line tree.

Lemma 5. Let $T$ be a Galton-Watson tree with offspring distribution $\xi$. Condition $T$ on the
event that the first leaf on the depth-first walk of $T$ has height $n$ and that there are exactly
$k$ vertices in $T$ that are children of vertices on the path from the root to the first leaf on the
depth-first walk that are not on this path themselves. Let $v_1, \dots, v_k$ be these vertices (again
in order of appearance) and let $T_{v_j}$ be the plane subtree of $T$ above $v_j$. The collection
$\{T_{v_j}\}_{j=1}^{k}$ is a collection of i.i.d. Galton-Watson trees with the same distribution as $T$.

Proof. Let $t_1, \dots, t_k$ be rooted ordered trees. Let $A$ be the set of trees $t$ such that the first
leaf on the depth-first walk of $t$ has height $n$ and that there are exactly $k$ vertices $v_1, \dots, v_k$
(listed in order of appearance on the depth-first walk) in $t$ that are children of vertices on
the path from the root to the first leaf on the depth-first walk that are not on this path
themselves. Let $B \subseteq A$ be the set of trees such that $t_j$ is the tree above $v_j$ for all $1 \le j \le k$.
Let $C$ be the set of sequences that appear as the sequence of degrees of vertices on the path
from the root to the left-most leaf of a tree in $A$. Note that

\[ P(T \in A) = \sum_{(y_1,\dots,y_{n+1}) \in C} \ \prod_{i=1}^{n+1} \xi(y_i) \]

since, given $(y_1, \dots, y_{n+1}) \in C$, there are $k$ places to attach trees to the path from the root
to the left-most leaf, and we sum over all ways of doing this. Consequently, we have

\begin{align*}
P(T_{v_j} = t_j,\ 1 \le j \le k \mid T \in A)
  &= \frac{1}{P(T \in A)} \sum_{t \in B} P(T = t) \\
  &= \frac{1}{P(T \in A)} \sum_{(y_1,\dots,y_{n+1}) \in C} \ \prod_{i=1}^{n+1} \xi(y_i) \prod_{l=1}^{k} P(T = t_l) \\
  &= \prod_{i=1}^{k} P(T = t_i).
\end{align*}

□

Theorem 7. Let $T$ be a Galton-Watson tree with offspring distribution $\xi$, and define $\tilde T$ and
$T'$ as above (see the earlier Proposition for $T'$). Then $\tilde T \overset{d}{=} T'$.

Proof. Let $t$ be a rooted plane tree and consider $P(\tilde T = t)$. If $t$ has one vertex, it is clear that
$P(\tilde T = t) = P(T' = t)$. Suppose that the result is true for all trees with fewer than $n$ vertices,
and suppose that $t$ has $n$ vertices. Let $t_1, \dots, t_k$ be the subtrees of $t$ attached to the root of
$t$, listed in order of appearance in the depth-first walk of $t$. Let $A_{i,k}$ be the event that the
first leaf to appear on the depth-first walk of $T$ has height $i$ and that there are $k$ vertices in
$T$ that are children of vertices on the path from the root to the first leaf on the depth-first
walk that are not on this path themselves. Let $v^i_1, \dots, v^i_k$ be these vertices (again in order of
appearance) and let $T_{v^i_j}$ be the plane subtree of $T$ above $v^i_j$. Lemma 5 shows that, for fixed
$i$, conditionally on $A_{i,k}$, the $T_{v^i_j}$ are i.i.d. distributed like $T$. From our discussion above, we
have that conditionally on $A_{i,k}$, $\tilde T = t$ if and only if $\tilde T_{v^i_j} = t_j$ for all $j$. Since $t_j$ has fewer
than $n$ vertices, the inductive hypothesis implies

\[ P(\tilde T = t) = \sum_{i=1}^{\infty} P(T \in A_{i,k})\, P(\tilde T_{v^i_j} = t_j \text{ for all } j \mid T \in A_{i,k})
   = \Bigl( \prod_{j=1}^{k} P(T' = t_j) \Bigr) \Bigl( \sum_{i=1}^{\infty} P(T \in A_{i,k}) \Bigr). \]

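Hence it remains to show that $\sum_{i=1}^{\infty} P(T \in A_{i,k})$ equals the probability that the root of $T'$
has exactly $k$ children.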

Let (Xj)^Q be i.i.d. distributed like ^. We then have 

P(T G A^k) = P = 0, J2iX, - 1) = A;, X, - 1 > for 1 < J < z j = 
where for a power series ^'(-z), [z'^]4' is the coefficient of z^. Thus we have 

oo oo ^ 

i=l i=0 '^^ ' 

where the interchange of limits is justified by positivity of the coefficients involved and we
may start the second sum at 0 since $k \ge 1$. □

3.2. The partition at the root. Let $\xi = (\xi_i)_{i \ge 0}$ be a probability distribution with mean
1 and variance $0 < \sigma^2 < \infty$. Let $T$ be a Galton-Watson tree with offspring distribution $\xi$
(denote the law of $T$ by $\mathrm{GW}_\xi$). Let $A \subseteq \mathbb{Z}^+$ contain 0 and construct $\tilde T$ as in Corollary 2.
Then, by Theorem 6, $\tilde T$ is a Galton-Watson tree. Let $\zeta$ be its offspring distribution. Again
by Theorem 6, $\zeta$ has mean 1 and variance $\tilde\sigma^2 = \sigma^2/\xi(A)$. Assume that for sufficiently large
$n$, $P(\#_A T = n) > 0$. Let $T_n^A$ be $T$ conditioned to have exactly $n$ vertices with out-degree in
$A$ (whenever this conditioning makes sense).

For $t$ a rooted unordered tree with exactly $n$ vertices with out-degree in $A$, let $\Pi^A(t)$
be the partition of $n$ or $n - 1$ (depending on whether or not the degree of the root of $t$ is in
$A$) defined by the number of vertices with out-degree in $A$ in the subtrees of $t$ attached to
the root.

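Lemma 6. (i) Considered as an unordered tree, the law of $T_n^A$ is a Markov branching law
with splitting laws $(q_n)$ where, for $n \ge 2$ such that $T_n^A$ is defined and $\lambda = (\lambda_1, \dots, \lambda_p)$ a
partition of $n - 1(p \in A)$, we have

\[ q_n(\lambda) = P(\Pi^A(T_n^A) = \lambda)
   = \frac{p!}{\prod_{j \ge 1} m_j(\lambda)!}\, \xi(p)\, \frac{\prod_{i=1}^{p} P(\#_A T = \lambda_i)}{P(\#_A T = n)}. \]

(ii) Let $X_1, X_2, \dots$ be i.i.d. distributed like $\#_A T$ and $\tau_k = X_1 + \dots + X_k$. We have

\[ q_n(p(\lambda) = p) = \xi(p)\, \frac{P(\tau_p = n - 1(p \in A))}{P(\tau_1 = n)} \]

and $P(\Pi^A(T_n^A) \in \cdot \mid \{p(\Pi^A(T_n^A)) = p\})$ is the law of a non-increasing rearrangement of
$(X_1, \dots, X_p)$ conditionally on $X_1 + \dots + X_p = n - 1(p \in A)$.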

Proof. (i) Letting $c_\emptyset(T_n^A)$ be the root degree of $T_n^A$ and $a_1, \dots, a_p \in \mathbb{N}$ with sum
$n - 1(p \in A)$ we have

(3.3) \[ P\bigl(c_\emptyset(T_n^A) = p,\ \#_A[(T_n^A)_i] = a_i,\ 1 \le i \le p\bigr)
   = \xi(p)\, \frac{\prod_{i=1}^{p} P(\#_A T = a_i)}{P(\#_A T = n)}. \]

Part (i) now follows by considering the number of sequences $(a_1, \dots, a_p)$ with the same
decreasing rearrangement.

(ii) This follows from Equation (3.3). □

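To simplify notation, let $q_n$ be the law of $\Pi^A(T_n^A)$ and let $1_p = 1(p \in A)$. Let $(S_r, r \ge 0)$
be a random walk with step distribution $(\zeta_{i+1}, i \ge -1)$. By Corollary 2, we have

(3.4) \[ q_n(p(\lambda) = p) = \xi(p)\, \frac{P(\tau_p = n - 1_p)}{P(\tau_1 = n)}
   = \frac{n}{n - 1_p}\, \hat\xi(p)\, \frac{P(S_{n - 1_p} = -p)}{P(S_n = -1)}, \]

where $\hat\xi(p) = p\,\xi(p)$ is the size-biased distribution of $\xi$.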
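The passage from hitting times of forests to random-walk probabilities can be checked by hand in the classical setting that Corollary 2 generalizes. The sketch below is an illustration under assumptions that are not in the paper: A is taken to be all out-degrees (so the size of a tree is its number of vertices) and the offspring law is the critical geometric law with mass 2^{-(k+1)} at k. In that case the classical Otter-Dwass formula reads P(tau_p = n) = (p/n) P(S_n = -p), and both sides can be computed exactly. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def progeny_pmf(n):
    """P(#T = n): Catalan(n-1) plane trees, each of probability 2^-(2n-1)."""
    catalan = comb(2 * n - 2, n - 1) // n
    return Fraction(catalan, 2 ** (2 * n - 1))


def walk_pmf(n, p):
    """P(S_n = -p) for steps (offspring - 1): the n offspring counts sum to n - p."""
    m = n - p
    if m < 0:
        return Fraction(0)
    # Sum of n i.i.d. geometric(1/2) variables on {0, 1, ...} is negative binomial.
    return Fraction(comb(m + n - 1, n - 1), 2 ** (m + n))


def forest_pmf(p, n_max):
    """P(tau_p = n) for n = 0..n_max, by convolving the single-tree law p times."""
    single = [Fraction(0)] + [progeny_pmf(n) for n in range(1, n_max + 1)]
    dist = [Fraction(1)] + [Fraction(0)] * n_max      # tau_0 = 0
    for _ in range(p):
        dist = [sum(dist[k] * single[n - k] for k in range(n + 1))
                for n in range(n_max + 1)]
    return dist


def otter_dwass_holds(p, n_max):
    """Check P(tau_p = n) == (p/n) P(S_n = -p) exactly for 1 <= n <= n_max."""
    dist = forest_pmf(p, n_max)
    return all(dist[n] == Fraction(p, n) * walk_pmf(n, p)
               for n in range(1, n_max + 1))
```

Exact rational arithmetic is used so that the identity is verified as an equality rather than up to rounding.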

Define $\bar q_n$ to be the pushforward of $q_n$ onto $\mathcal{S}^{\downarrow}$ by the map $\lambda \mapsto \lambda / \sum_i \lambda_i$.

For a sequence $(x_1, x_2, \dots)$ of non-negative numbers such that $0 < \sum_i x_i < \infty$, let $i^*$ be a
random variable with

\[ P(i^* = i) = \frac{x_i}{\sum_j x_j}. \]

The random variable $x_{i^*}$ is called a size-biased pick from $(x_1, x_2, \dots)$. Given $i^*$, we
remove the $i^*$'th entry from $(x_1, x_2, \dots)$ and repeat the process. This yields a random reordering
$(x_1^*, x_2^*, \dots)$ of $(x_1, x_2, \dots)$ called the size-biased order (if ever no positive terms remain, the
rest of the size-biased elements are 0). Similarly, for a random sequence $(X_1, X_2, \dots)$ we
define the size-biased ordering by first conditioning on the value of the sequence. For any
finite measure $\mu$ on $\mathcal{S}^{\downarrow}$, define the size-biased distribution $\mu^*$ of $\mu$ by

\[ \mu^*(f) = \int_{\mathcal{S}^{\downarrow}} \mu(ds)\, E[f(s^*)], \]

where $s^*$ is the size-biased reordering of $s$.
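For a finite sequence the size-biased reordering is easy to sample. The following is a minimal sketch, assuming a finite list of non-negative weights; the function name `size_biased_reorder` and the optional `rng` argument are illustrative choices, not notation from the paper.

```python
import random


def size_biased_reorder(xs, rng=random):
    """Return one sample of the size-biased reordering of a finite sequence.

    At each step an entry is picked with probability proportional to its
    value among the entries that remain, then removed. Once only zeros
    remain, they are appended in their current order.
    """
    remaining = list(xs)
    out = []
    while remaining and sum(remaining) > 0:
        i = rng.choices(range(len(remaining)), weights=remaining, k=1)[0]
        out.append(remaining.pop(i))
    out.extend(remaining)        # only zeros are left at this point
    return out
```

With weights `[3, 1]` the first entry of the output is 3 with probability 3/4, which is the size-biased pick described above.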
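Define the measure $\nu_2$ on $\mathcal{S}^{\downarrow}$ by

\[ \int_{\mathcal{S}^{\downarrow}} \nu_2(ds)\, f(s)
   = \int_{1/2}^{1} \sqrt{\frac{2}{\pi s_1^3 (1 - s_1)^3}}\; ds_1\, f(s_1, 1 - s_1, 0, 0, \dots). \]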
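As a quick check that $(1 - s_1)\nu_2(ds)$ is a finite measure, here is a short computation of its total mass. It is a sketch under the assumption that $\nu_2$ is the Brownian dislocation measure, that is, that it has density $\sqrt{2/(\pi s_1^3(1-s_1)^3)}$ in $s_1 \in [1/2, 1)$ and is carried by sequences of the form $(s_1, 1 - s_1, 0, \dots)$.

```latex
\int_{\mathcal{S}^{\downarrow}} (1 - s_1)\,\nu_2(ds)
  = \sqrt{\frac{2}{\pi}} \int_{1/2}^{1} s_1^{-3/2} (1 - s_1)^{-1/2}\, ds_1
  = \sqrt{\frac{2}{\pi}} \left[ -2 \sqrt{\frac{1 - s_1}{s_1}} \right]_{1/2}^{1}
  = 2 \sqrt{\frac{2}{\pi}} .
```

The antiderivative can be verified by differentiating $-2\sqrt{(1-s_1)/s_1}$, which gives $s_1^{-3/2}(1-s_1)^{-1/2}$. The measure $\nu_2$ itself has infinite mass near $s_1 = 1$, which is why the factor $(1 - s_1)$ is needed.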



Theorem 8. With the notation above,

\[ \lim_{n \to \infty} n^{1/2} (1 - s_1)\, \bar q_n(ds) = \frac{\sigma \sqrt{\xi(A)}}{2}\, (1 - s_1)\, \nu_2(ds), \]

where the limit is taken in the sense of weak convergence of finite measures.

Proof. We follow the reductions in Section 5.1 of [7]. By Lemma 16 in [7] (which is an easy
variation of Proposition 2.3 in [3]) it is sufficient to show that

\[ \lim_{n \to \infty} n^{1/2} \bigl( (1 - s_1)\, \bar q_n(ds) \bigr)^* = \frac{\sigma \sqrt{\xi(A)}}{2}\, \bigl( (1 - s_1)\, \nu_2(ds) \bigr)^*. \]

Note that for any finite non-negative measure $\mu$ on $\mathcal{S}^{\downarrow}$ and non-negative continuous function
$f : \mathcal{S}_1 \to \mathbb{R}$ we have

\[ \bigl( (1 - s_1)\, \mu(ds) \bigr)^*(f) = \int_{\mathcal{S}_1} \mu^*(dx)\, (1 - \max x)\, f(x). \]

Consequently the theorem follows from the following Proposition. □
Proposition 5. Let $f : \mathcal{S}_1 \to \mathbb{R}$ be continuous and let $g(x) = (1 - \max x) f(x)$. Then

\[ \lim_{n \to \infty} \sqrt{n}\, \bar q_n^*(g) = \frac{\sigma \sqrt{\xi(A)}}{\sqrt{2\pi}} \int_0^1 \frac{g(x, 1 - x, 0, 0, \dots)}{x^{1/2} (1 - x)^{3/2}}\, dx. \]

First note that, by linearity, we may assume that $f \ge 0$ and $\|f\|_\infty \le 1$. We begin the
proof of this Proposition with several Lemmas regarding the concentration of mass of $\bar q_n^*$. We
also note that for the remainder of this section we are following Section 5.1 in [7] very closely,
with minor differences to account for our more general setting: we invoke Corollary 2
rather than the Otter-Dwass formula, and we get different intermediate constants than they
get, but the end results are the same. Nonetheless, the full computation is worth including
because it makes clear why the factor of $\sqrt{\xi(A)}$ appears in the scaling limit.

Lemma 7. For every $\epsilon > 0$, $\sqrt{n}\, q_n(p(\lambda) > \epsilon \sqrt{n}) \to 0$ as $n \to \infty$.

Proof. Observe that we have the local limit theorem (see e.g. Theorem 3.5.2 in 



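(3.5) \[ \lim_{n \to \infty} \sup_{p \in \mathbb{Z}} \left| \sqrt{n}\, P(S_n = -p) - \frac{1}{\tilde\sigma \sqrt{2\pi}} \exp\Bigl( -\frac{p^2}{2 n \tilde\sigma^2} \Bigr) \right| = 0. \]

Using this and Equation (3.4) we have $q_n(p(\lambda) = p) \le C \hat\xi(p)$ for some $C$ independent of
$n$ and $p$. Since $\xi$ has finite variance, $\hat\xi$ has finite mean so $\hat\xi((k, \infty)) = o(k^{-1})$. The result
follows. □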

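Note that an immediate consequence of the local limit theorem is that for any fixed $k > 0$
we have

\[ \lim_{n \to \infty} \sup_{1 \le p \le k n^{1/2}} \left| \tilde\sigma \sqrt{2 \pi n}\, \exp\Bigl( \frac{p^2}{2 n \tilde\sigma^2} \Bigr) P(S_n = -p) - 1 \right| = 0, \]

and this is often the result we are really using when we cite the local limit theorem.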
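The uniform ratio form of the local limit theorem can be observed numerically in a case where the walk probabilities are explicit. The sketch below makes assumptions that are not in the paper: the step law is that of (offspring - 1) for the critical geometric offspring law with mass 2^{-(k+1)} at k, so that the variance is 2 and P(S_n = -p) is a negative binomial probability. The function names are hypothetical.

```python
from math import exp, lgamma, log, pi, sqrt


def log_walk_pmf(n, p):
    """log P(S_n = -p) = log[ C(2n-p-1, n-1) / 2^(2n-p) ], via log-gamma."""
    return (lgamma(2 * n - p) - lgamma(n) - lgamma(n - p + 1)
            - (2 * n - p) * log(2))


def max_relative_error(n, k=1.0, var=2.0):
    """Largest |sigma sqrt(2 pi n) exp(p^2 / (2 n sigma^2)) P(S_n = -p) - 1|
    over 1 <= p <= k sqrt(n), where sigma^2 = var."""
    worst = 0.0
    for p in range(1, int(k * sqrt(n)) + 1):
        log_ratio = (0.5 * log(2 * pi * var * n)
                     + p * p / (2 * n * var)
                     + log_walk_pmf(n, p))
        worst = max(worst, abs(exp(log_ratio) - 1.0))
    return worst
```

Evaluating `max_relative_error` at increasing values of `n` shows the supremum over the window shrinking, which is the form of the statement used repeatedly in the lemmas that follow.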
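Lemma 8. For $g$ as in Proposition 5 we have

\[ \lim_{\eta \to 0} \limsup_{n \to \infty} \sqrt{n}\, \bar q_n^*\bigl(|g| 1_{\{x_1 > 1 - \eta\}}\bigr) = 0
   \quad \text{and} \quad
   \lim_{n \to \infty} \sqrt{n}\, \bar q_n^*\bigl(1_{\{x_1 < n^{-7/8}\}}\bigr) = 0. \]

Proof. Fix $\eta > 0$. Since we are assuming $\|f\|_\infty \le 1$, we know that $|g(x)| \le (1 - x_1)$.
Observing that

\[ P(X_1^* = m \mid X_1 + \dots + X_p = n) = \frac{p m}{n}\, \frac{P(X_1 = m)\, P(X_2 + \dots + X_p = n - m)}{P(X_1 + \dots + X_p = n)} \]

and using (ii) in Lemma 6 we see that $\sqrt{n}\, \bar q_n^*(|g| 1_{\{x_1 > 1 - \eta\}})$ is bounded above by $o(1)$ plus

\[ \sqrt{n} \sum_{1 \le p \le n^{1/2}} q_n(p(\lambda) = p) \sum_{(1 - \eta) n < m_1 \le n - 1_p}
   \Bigl( 1 - \frac{m_1}{n - 1_p} \Bigr) \frac{p\, m_1}{n - 1_p}\,
   \frac{P(X_1 = m_1)\, P(\tau_{p-1} = n - m_1 - 1_p)}{P(\tau_p = n - 1_p)}, \]

where the $o(1)$ term is justified by Lemma 7 and our restriction to $1 \le p \le n^{1/2}$. Observe
that Equation (3.5) implies that

\[ P(\tau_1 = n) = \frac{1}{n} P(S_n = -1) \sim \frac{1}{\tilde\sigma \sqrt{2\pi}\, n^{3/2}}. \]

Using Corollary 2 and again that $q_n(p(\lambda) = p) \le C \hat\xi(p)$ for some $C$ independent of $n$ and
$p$, we find that for large $n$, $\sqrt{n}\, \bar q_n^*(|g| 1_{\{x_1 > 1 - \eta\}})$ is bounded above by $o(1)$ plus

\[ C \sqrt{n} \sum_{1 \le p \le n^{1/2}} \hat\xi(p) \sum_{(1 - \eta) n < m_1 < n - 1_p}
   \Bigl( 1 - \frac{m_1}{n - 1_p} \Bigr) \frac{p\, m_1}{n - 1_p}\, m_1^{-3/2}\,
   \frac{\frac{p - 1}{n - m_1 - 1_p}\, P(S_{n - m_1 - 1_p} = -p + 1)}{\frac{p}{n - 1_p}\, P(S_{n - 1_p} = -p)}. \]

Simplifying we get

\[ \sqrt{n}\, \bar q_n^*\bigl(|g| 1_{\{x_1 > 1 - \eta\}}\bigr) \le o(1) + C \sum_{1 \le p \le n^{1/2}} p^2 \xi(p)\, \frac{1}{\sqrt{n}}
   \sum_{(1 - \eta) n < m_1 < n - 1_p} \frac{1}{\sqrt{m_1}}\, \frac{P(S_{n - m_1 - 1_p} = -p + 1)}{P(S_{n - 1_p} = -p)}. \]

Equation (3.5) implies that, for $1 \le p \le n^{1/2}$, $\sqrt{n}\, P(S_{n - 1_p} = -p)$ and
$\sqrt{n - m_1}\, P(S_{n - m_1 - 1_p} = -p + 1)$ are bounded below and above respectively by constants
independent of $n$ and $p$. Hence we have

\begin{align*}
\sqrt{n}\, \bar q_n^*\bigl(|g| 1_{\{x_1 > 1 - \eta\}}\bigr)
  &\le o(1) + C \sum_{1 \le p \le n^{1/2}} p^2 \xi(p)\, \frac{1}{n} \sum_{(1 - \eta) n < m_1 < n - 1_p}
     \Bigl( \frac{m_1}{n} \Bigr)^{-1/2} \Bigl( 1 - \frac{m_1}{n} \Bigr)^{-1/2} \\
  &\le o(1) + C \sum_{p = 1}^{\infty} p^2 \xi(p)\, \frac{1}{n} \sum_{(1 - \eta) n < m_1 < n - 1_p}
     \Bigl( \frac{m_1}{n} \Bigr)^{-1/2} \Bigl( 1 - \frac{m_1}{n} \Bigr)^{-1/2}.
\end{align*}

Note that the $m_1 = n$ term has been absorbed into the $o(1)$ term. The upper bound converges
to $C \int_{1 - \eta}^{1} (x(1 - x))^{-1/2}\, dx$, which goes to 0 as $\eta \to 0$. A little bit of care must be taken
here since the integral is improper as a Riemann integral; however, this is fine since the sums
actually under-approximate the integral in this case.

The second limit can be proved in a similar fashion. Note that $\sqrt{n}\, \bar q_n^*(1_{\{x_1 < n^{-7/8}\}})$ is
bounded above by

\begin{align*}
& n^{1/2} \sum_{1 \le p \le n^{1/2}} q_n(p(\lambda) = p) \sum_{m_1 \le n^{1/8}} \frac{p\, m_1}{n - 1_p}\,
   \frac{P(X_1 = m_1)\, P(\tau_{p-1} = n - m_1 - 1_p)}{P(\tau_p = n - 1_p)} + o(1) \\
&\qquad \le C n^{1/2} \sum_{1 \le p \le n^{1/2}} \hat\xi(p) \sum_{1 \le m_1 \le n^{1/8}} \frac{p\, m_1}{n - 1_p}\,
   \frac{P(\tau_{p-1} = n - m_1 - 1_p)}{P(\tau_p = n - 1_p)} + o(1) \\
&\qquad \le C n^{-1/4} \sum_{1 \le p \le n^{1/2}} p^2 \xi(p) + o(1),
\end{align*}

where the last step is justified by the local limit theorem. □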
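Lemma 9. For every $\eta > 0$ we have

\[ \lim_{n \to \infty} \sqrt{n}\, \bar q_n^*\bigl(1_{\{x_1 + x_2 < 1 - \eta\}}\bigr) = 0. \]

Proof. Fix $0 < \epsilon < 1$. Up to addition of an $o(1)$ term depending on $\epsilon$ we have that
$\sqrt{n}\, \bar q_n^*(1_{\{x_1 + x_2 < 1 - \eta\}})$ is bounded above by

\[ \sqrt{n} \sum_{1 \le p \le \epsilon n^{1/2}} q_n(p(\lambda) = p) \sum_{m_1 + m_2 \le (1 - \eta) n}
   \frac{p\, m_1}{n - 1_p}\, \frac{(p - 1)\, m_2}{n - m_1 - 1_p}\,
   \frac{P(X_1 = m_1)\, P(X_2 = m_2)\, P(\tau_{p-2} = n - m_1 - m_2 - 1_p)}{P(\tau_p = n - 1_p)}, \]

where $m_1, m_2 \ge 1$. If $m_1 + m_2 \le (1 - \eta) n$ then $n - m_1 - m_2 \ge \eta n$ and, in particular, the
quantity on the left goes to $\infty$ as $n$ does. Consequently Corollary 2 and Equation (3.5) imply
that

\[ \frac{P(\tau_{p-2} = n - m_1 - m_2 - 1_p)}{P(\tau_p = n - 1_p)} \le \frac{C}{\eta^{3/2}} \]

for $C$ independent of $1 \le p \le \epsilon n^{1/2}$. Our assumption that $\epsilon < 1$ implies that $C$ is independent
of $\epsilon$ as well. Again using that $n^{3/2} P(X_1 = n)$ is bounded we have, for a constant $C_\eta$ depending
only on $\eta$,

\begin{align*}
\sqrt{n}\, \bar q_n^*\bigl(1_{\{x_1 + x_2 < 1 - \eta\}}\bigr)
  &\le o(1) + C_\eta\, \epsilon \sum_{1 \le p \le \epsilon n^{1/2}} p\, \hat\xi(p)\, \frac{1}{n^2} \sum_{m_1 + m_2 \le (1 - \eta) n}
     \frac{1}{\sqrt{(m_1/n)(m_2/n)}} \\
  &\le o(1) + C_\eta\, \epsilon \sum_{p \ge 1} p\, \hat\xi(p) \int_0^1 \!\! \int_0^1 \frac{dx\, dy}{\sqrt{x y}}.
\end{align*}

Taking the lim sup as $n \to \infty$ and then letting $\epsilon \to 0$ yields the result. □

Lemma 10. There exists a function $\beta_\eta = o(\eta)$ as $\eta \downarrow 0$ such that

\begin{align*}
\lim_{\eta \to 0} \liminf_{n \to \infty} \sqrt{n}\, \bar q_n^*\bigl(g 1_{\{x_1 < 1 - \eta,\ x_1 + x_2 > 1 - \beta_\eta\}}\bigr)
  &= \lim_{\eta \to 0} \limsup_{n \to \infty} \sqrt{n}\, \bar q_n^*\bigl(g 1_{\{x_1 < 1 - \eta,\ x_1 + x_2 > 1 - \beta_\eta\}}\bigr) \\
  &= \frac{\sigma \sqrt{\xi(A)}}{\sqrt{2\pi}} \int_0^1 \frac{g(x, 1 - x, 0, \dots)}{x^{1/2} (1 - x)^{3/2}}\, dx.
\end{align*}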



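Proof. Fix $\eta > 0$ and suppose that $\eta' \in (0, \eta)$. Using Lemmas 7 and 8 we decompose
according to the events $\{p(\lambda) > \epsilon \sqrt{n}\}$ and $\{x : x_1 < n^{-7/8}\}$ to get

(3.6)
\begin{align*}
\sqrt{n}\, \bar q_n^*\bigl(g 1_{\{x_1 < 1 - \eta,\ x_1 + x_2 > 1 - \eta'\}}\bigr)
  = o(1) &+ \sqrt{n} \sum_{1 \le p \le \epsilon n^{1/2}} q_n(p(\lambda) = p) \\
  &\times \sum_{\substack{n^{1/8} \le m_1 \le (1 - \eta)(n - 1_p) \\ (1 - \eta')(n - 1_p) \le m_1 + m_2 \le n - 1_p}}
     E\Bigl[ g\bigl( (m_1, m_2, X_3^*, \dots, X_p^*, 0, \dots)/(n - 1_p) \bigr) \Bigm| \tau_p = n - 1_p,\ X_1^* = m_1,\ X_2^* = m_2 \Bigr] \\
  &\times \frac{p\, m_1}{n - 1_p}\, \frac{(p - 1)\, m_2}{n - m_1 - 1_p}\, P(X_1 = m_1)\, P(X_2 = m_2)\,
     \frac{P(\tau_{p-2} = n - m_1 - m_2 - 1_p)}{P(\tau_p = n - 1_p)}.
\end{align*}

Observe that, if $1 \ge x_1 + x_2 > 1 - \eta'$ and $x_1 < 1 - \eta$, then $x_2/(1 - x_1) > 1 - \eta'/\eta$ and
$(1 - x_1)/x_2 \ge 1$.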

Using the local limit theorem and we observe that 



sup 

l<p<eni/2 



aV2^exp (^^) = ~p) - av^exp [1^^ P(^n 







Consequently, 



sup 

l<p<enl/2 



gn(p(A)=p) 



sup 

l<p<ni/2 



V27r(T2nexp(pV2r2o-)(r2 - lp)P(5n = -p) ^ 
n exp(l/2ncT)nP(S'n = —1) 



^ 0. 



Thus, for sufficiently large n and small e, we have 

for all 1 < p < en^/2. Using the local limit theorem and Corollary [2] we have 

exp 



sup 

l<p<eni/2 



P 



2ncT2 



P(rp = n)-l 



0. 



Thus, for sufficiently large n and small e, we have 



for all $1 \le p \le \epsilon n^{1/2}$. We note in particular that $\tau_1 = X_1 =_d X_2$. Furthermore, for
$n^{1/8} \le m_1 \le (1 - \eta) n$ and $m_1 + m_2 \ge (1 - \eta') n$ we have $m_2 \ge (\eta - \eta') n$ so that $m_1$ and $m_2$
go to infinity as $n$ does. Thus, for large $n$ (how large now depends on $\eta'$) we have

< (mim2)^/^P(Xi = mi)P(X2 = m^) < ^V^- 

Now, recall that $f$ is uniformly continuous on $\mathcal{S}_1$. Furthermore, on the set $\{x \in \mathcal{S}_1 :
x_1 + x_2 > 3/4\}$ we have $\max x = x_1 \vee x_2$ and $x \mapsto \max x$ is thus uniformly continuous on
this set. Therefore for $\eta' < (1/4) \wedge \eta^2$ sufficiently small we have

\[ \bigl| g\bigl( (m_1, m_2, m_3, \dots)/n \bigr) - g\bigl( (m_1, n - m_1, 0, \dots)/n \bigr) \bigr| \le \eta, \]

for every $(m_1, m_2, \dots)$ with sum $n$ sufficiently large such that $m_1 + m_2 \ge (1 - \eta') n$. Take
$\beta_\eta := \eta'$.

Given the symmetry of the bounds we have just established it is easy to see that the proofs
for the lim sup and lim inf will be nearly identical, one using the upper bounds and the other
the lower. We will only write down the proof for the lim inf. For sufficiently large $n$ we have
that, up to addition of an $o(1)$ term, $\sqrt{n}\, \bar q_n^*(g 1_{\{x_1 < 1 - \eta,\ x_1 + x_2 > 1 - \eta'\}})$ is bounded below by

l<p<en^/^ 

X XI igiimi,n-mi-lp,0,...)/{n-lp)) -7]) 

ni/8<mi<(l--^)(n-lp) 

1 1 1 

X 



(mi/(n - lp))i/2 (1 _ mi/{n - 

X X^ P(Tp-2 = n — mi — m2 — 1 

(1— »;')(n— Ip)— r?ii<r?i2<n— mi — Ip 



Observe that this last sum is equal to $\sum_{m=0}^{\lfloor \eta'(n - 1_p) \rfloor} P(\tau_{p-2} = m)$. By the local limit theorem,
this can be made arbitrarily close to 1 independent of $1 \le p \le \epsilon n^{1/2}$. Using the convergence
of Riemann sums (again care must be taken since the integral we get is improper), we have

lim inf A/rag; (5(l{^^<i_^,^,+^2>i_^/}) 

n— s>oo 

> ^^-^^'^^-^'1^^ y^p _ i)e(p) ^ (,(x, i-xA...)-v) 

l + V jr[ Jo av^xi/2(l-a;)3/2^^^ ^ ^ J U 

Letting $\eta \downarrow 0$, coupled with observing that $\sum_{p=1}^{\infty} (p - 1)\hat\xi(p) = \sigma^2$ and recalling that
$\tilde\sigma^2 = \sigma^2/\xi(A)$, completes the proof. □

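Proof of Proposition 5. Observe that

\[ \bigl| \bar q_n^*(g) - \bar q_n^*\bigl(g 1_{\{x_1 < 1 - \eta,\ x_1 + x_2 > 1 - \eta'\}}\bigr) \bigr|
   \le \bar q_n^*\bigl(|g| 1_{\{x_1 > 1 - \eta\}}\bigr) + \bar q_n^*\bigl(|g| 1_{\{x_1 + x_2 < 1 - \eta'\}}\bigr). \]

Fix $\epsilon > 0$ and apply Lemmas 8 and 10 to find $\eta, \eta'$ such that

\[ \sqrt{n}\, \bar q_n^*\bigl(|g| 1_{\{x_1 > 1 - \eta\}}\bigr) \le \frac{\epsilon}{2} \]

and

\[ \left| \sqrt{n}\, \bar q_n^*\bigl(g 1_{\{x_1 < 1 - \eta,\ x_1 + x_2 > 1 - \eta'\}}\bigr)
   - \frac{\sigma \sqrt{\xi(A)}}{\sqrt{2\pi}} \int_0^1 \frac{g(x, 1 - x, 0, 0, \dots)}{x^{1/2} (1 - x)^{3/2}}\, dx \right| \le \frac{\epsilon}{2} \]

for large enough $n$. For this choice of $\eta, \eta'$ and large $n$ we have

\[ \left| \sqrt{n}\, \bar q_n^*(g) - \frac{\sigma \sqrt{\xi(A)}}{\sqrt{2\pi}} \int_0^1 \frac{g(x, 1 - x, 0, 0, \dots)}{x^{1/2} (1 - x)^{3/2}}\, dx \right|
   \le \epsilon + \sqrt{n}\, \bar q_n^*\bigl(|g| 1_{\{x_1 + x_2 < 1 - \eta'\}}\bigr). \]

By Lemma 9 the upper bound goes to $\epsilon$ as $n \to \infty$, and the result follows. □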

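As an immediate corollary of these results, we also identify the unnormalized limit of $\bar q_n$.

Corollary 3. $\bar q_n \xrightarrow{d} \delta_{(1,0,0,\dots)}$.

Proof. Taking $f = 1$ in Proposition 5 gives $\bar q_n(1 - s_1) \to 0$. Thus $\bar q_n(s_1) \to 1$. Suppose
that there exists $0 < \eta < 1$ such that $\bar q_n(s_1 < \eta) \not\to 0$. Let $\epsilon = \limsup_{n \to \infty} \bar q_n(1_{\{s_1 < \eta\}}) > 0$.

Observing that

\begin{align*}
\bar q_n(s_1) &= \bar q_n\bigl(s_1 1_{\{s_1 < \eta\}}\bigr) + \bar q_n\bigl(s_1 1_{\{s_1 \ge \eta\}}\bigr) \\
  &\le 1 - (1 - \eta)\, \bar q_n(s_1 < \eta),
\end{align*}

we see that

\[ \liminf_{n \to \infty} \bar q_n(s_1) \le 1 - (1 - \eta)\epsilon < 1, \]

a contradiction. Thus $\bar q_n(1_{\{s_1 < \eta\}}) \to 0$ for all $0 < \eta < 1$ and, consequently, $\bar q_n(s_1 > \eta) \to 1$. □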

Note that, as a consequence of Equation (3.4), we have $q_n(p(\lambda) = p) \to \hat\xi(p)$. Thus, while
the degree of the root vertex may be large, only one of the trees attached to the root will
have noticeable size.

3.3. Convergence of Galton-Watson trees. We are now prepared to prove Theorem 1
which, after all of our work above, is rather straightforward.

Proof. Lemma 6 shows that $T_n^A$ has a Markov branching law for a particular choice of
$(q_n)_{n \ge 1}$. Theorem 8 then shows that the hypotheses of Theorem 5 are satisfied. □